Genotyping‐by‐sequencing analysis of Orobanche crenata populations in Algeria reveals genetic differentiation

Abstract Crenate broomrape (Orobanche crenata Forsk.) is a serious long‐standing parasitic weed problem in Algeria, mainly affecting legumes but also vegetable crops. Unresolved questions for parasitic weeds revolve around the extent to which these plants undergo local adaptation, especially with respect to host specialization, which would be expected to be a strong selective factor for obligate parasitic plants. In the present study, the genotyping‐by‐sequencing (GBS) approach was used to analyze genetic diversity and population structure of 10 Northern Algerian O. crenata populations with different geographical origins and host species (faba bean, pea, chickpea, carrot, and tomato). In total, 8004 high‐quality single‐nucleotide polymorphisms (5% missingness) were obtained and used across the study. Genetic diversity and relationships of 95 individuals from 10 populations were studied using model‐based ancestry analysis, principal components analysis, discriminant analysis of principal components, and phylogeny approaches. The genetic differentiation (FST) between pairs of populations was lower between adjacent populations and higher between geographically separated ones, but no support was found for isolation by distance. Further analyses identified four genetic clusters and revealed evidence of structuring among populations and, although confounded with location, among hosts. In the clearest example, O. crenata growing on pea had a SNP profile that was distinct from other host/location combinations. These results illustrate the importance and potential of GBS to reveal the dynamics of parasitic weed dispersal and population structure.

Orobanche crenata is one of about 150 species in the genus Orobanche (Orobanchaceae) (Wolfe et al., 2005), which are notable for their parasitic mode of nutrition. Like some other members of this family, O. crenata lacks chlorophyll and photosynthetic capacity and so is completely dependent on autotrophic host plants for its nutritional requirements. The genus is distributed mostly in the temperate and subtropical regions of the world, with its center in the Mediterranean area (Satovic et al., 2009; Zhang et al., 2014).
Orobanche crenata constitutes a major constraint to faba bean (Vicia faba L.) cultivation (Acharya, 2013; Pérez-de-Luque et al., 2010). However, this parasite also attacks crops such as lentil (Lens culinaris Medik.), pea (Pisum sativum L.), chickpea (Cicer arietinum L.), tomato (Solanum lycopersicum L.), lettuce (Lactuca sativa L.), and carrot (Daucus carota L.) (Aksoy et al., 2016; Renna et al., 2015; Román, Hernández, et al., 2007). Control of O. crenata is difficult because each plant produces large numbers of tiny seeds (up to 500,000 per plant) that can lie dormant in the soil for up to 20 years in the absence of a host (Habimana et al., 2014; Yahia et al., 2015). The parasite thus persists through seasons when hosts are not present, only to reappear when compatible host crops are replanted. Furthermore, the parasite is largely hidden below ground, as the seedlings attach to host roots and inflict much of their damage on the host before the parasite floral shoot emerges from the soil. Several methods have been advocated for control of this weed, including hand pulling, herbicides, biological control, delayed crop sowing, and crop rotation, but each of these suffers from disadvantages due to economic constraints or limited effectiveness (Eizenberg et al., 2013; Kannan & Zwanenburg, 2014; Sheoran et al., 2016).
In Algeria, O. crenata is the major Orobanche species and is a serious problem for legume crops, mainly faba bean, pea, and chickpea. This parasite has been reported in several regions of Algeria, with high levels of infestation leading to the complete destruction of affected crops in some localities, which forces farmers to give up growing legume crops (Labrada, 2008). Orobanche crenata is a long-standing agricultural problem in Algeria. The oldest herbarium specimens date to 1908 and were collected from legume crops in the region of El-Harrach (previously called "Maison Carrée" during the French colonial period). Historical accounts document the extent of Orobanche damage at the beginning of the last century. In 1923, Ducellier wrote the following: "Faba beans and peas cultivation is made impossible in certain localities of the Sahel of Algiers and of the plateau of 'Maison carrée', so much has become common there, in the last fifteen or twenty years, the crenate broomrape." At that time, the same author estimated that in some localities 60% of the land had become unsuitable for the cultivation of pea and faba bean as a result of the damage caused by this broomrape, which could lead to complete crop failure (Blanchard, 1952). More than 70 years after Ducellier's statements, the Orobanche problem continues to increase. The parasite not only was reported to be still widespread in the Sahel of Algiers on legumes (Zermane, 1998) but also was found in the "Ain Dem" region (at "Khemis Méliana" town, about 200 km west of Algiers), causing significant losses on the same crops (Mahmoudi, 1993).
A previous study aimed to understand the genetic diversity of this species in Algeria using RFLP and RAPD markers (Aouali et al., 2007). It showed a proportional increase in genetic distance with geographical distance and suggested that the center of dissemination for this parasitic plant might be the region of "Mitidja," which is near the Ain Taya (Algiers) location used in the present study (Figure 1). Studies in other regions, employing AFLP and RAPD markers (Paran et al., 1997; Román et al., 2002) as well as microsatellites (SSR; Belay et al., 2020), have generally found high levels of among-population individual variation and evidence of some degree of population structuring. In the case of comparisons across Spain and Israel (Román et al., 2002), in spite of evident gene flow, differentiation was found between countries and also among regions within them, with populations in Israel showing greater differentiation than those found in Spain. Similarly, Belay et al. (2020) identified some population differentiation (two genetic clusters) among SSR markers in northern Ethiopia and gene flow among populations with little evidence of geographic separation.
In recent years, improved molecular techniques have been developed for genetic analysis of populations (Satovic et al., 2009).
Advances in next-generation sequencing technologies have enabled a revolution in genetic research through the ability to generate large numbers of single nucleotide polymorphisms (SNPs; Crossa et al., 2013). Genotyping by sequencing (GBS) is a high-throughput genotyping platform that integrates SNP discovery and genotype calling into one step by reducing genome complexity via restriction enzymes (Elshire et al., 2011). It is an attractive, cost-effective technology for breeders because it generates large numbers of SNPs for exploring within-species diversity, constructing haplotype maps, and conducting genome-wide association studies and genomic selection (Poland & Rife, 2012). The reduced representation of the genome and the barcoding of each individual enable multiple samples to be sequenced in one lane, leading to low-cost genotyping of many individuals (Elshire et al., 2011).
Given the tremendous economic impact of O. crenata, the study of the genetic variation in this parasitic weed is important because it could lead to a better understanding of O. crenata spread and adaptation. Furthermore, understanding how populations are genetically structured can provide some insight into how genetic variation in this species is shaped by evolution. In the present study, the GBS approach was used to identify and genotype SNPs in northern Algerian O. crenata populations that represent diversity in terms of geography and host species, with the aim of understanding the population structure and geographical distribution.

| Genotyping by sequencing
Genomic DNA was extracted from floral buds using the Qiagen DNeasy Plant Mini kit (QIAGEN Strasse 1, 40724 Hilden, Germany) following the manufacturer's instructions. Samples were sent to the Institute of

FIGURE 1 Sampling locations for 10 collections of Orobanche crenata from north central Algeria used in this study. Two-letter designations indicate the collection and colors indicate the host crop: dark green = faba bean, light green = pea, yellow with red outline = chickpea and tomato, orange = carrot (described in Table 1).

| Sequencing data analysis and SNP calling
Raw sequence data were processed using the Universal Network-Enabled Analysis Kit (UNEAK) pipeline implemented in the iPlant Collaborative platform. This pipeline produced a hapmap file for downstream analysis. This file was used as input for SNP identification using the GBS pipeline implemented in TASSEL (Version: 3.0.166). Raw SNPs were filtered following the dDocent guidelines (Puritz et al., 2014). In short, using vcftools (Danecek et al., 2011), variants were filtered for depth >5, quality >Q30, and initially 50% missingness. This file was used to screen samples for high levels of missingness (all were <30%). The final SNP set was filtered to allow a maximum of 5% missing values and to exclude SNPs with a minor allele frequency <0.05.
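The final filtering step can be illustrated with a minimal Python sketch. This is not the vcftools workflow used in the study; it assumes the genotypes have already been exported to a numeric matrix of alternate-allele counts (0, 1, 2, with NaN for missing calls), and the function name `filter_snps` and its parameters are hypothetical.

```python
import numpy as np

def filter_snps(genotypes, max_missing=0.05, min_maf=0.05):
    """Return a boolean mask of SNPs (columns) passing both filters.

    genotypes: 2-D array, rows = individuals, columns = SNPs, holding
    alternate-allele counts (0, 1, 2) with np.nan for missing calls.
    The matrix layout is an assumption made for this illustration.
    """
    g = np.asarray(genotypes, dtype=float)
    n_ind = g.shape[0]
    n_called = (~np.isnan(g)).sum(axis=0)

    # Fraction of individuals without a genotype call at each SNP.
    missing_rate = (n_ind - n_called) / n_ind

    # Alternate-allele frequency among called diploid genotypes.
    alt_count = np.nansum(g, axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        alt_freq = np.where(n_called > 0, alt_count / (2.0 * n_called), np.nan)

    # Minor allele frequency is the rarer of the two alleles.
    maf = np.minimum(alt_freq, 1.0 - alt_freq)

    # SNPs with no calls at all have maf = NaN and fail the comparison.
    return (missing_rate <= max_missing) & (maf >= min_maf)
```

Applying the mask as `genotypes[:, filter_snps(genotypes)]` keeps only SNPs with at most 5% missing calls and a minor allele frequency of at least 0.05.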

| Population structure
The number of genetic clusters across populations was identified using maximum-likelihood model-based ancestry estimation via ADMIXTURE v1.3.0 as well as principal components analysis and discriminant analysis of principal components as implemented in poppr v2.9.0.
ADMIXTURE was run with 15 iterations for each k from 1 to 10. Cross-validation was used to identify the optimal k. Cross-validation was also used to identify the optimal number of principal components via "xvalDapc" in poppr, which were used to identify the number of clusters in the data via "find.clusters." All data were visualized using poppr or ggplot2 v3.3.3 (Wickham, 2016). All scripts necessary to reproduce these analyses can be found here: figshare link.
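Selecting k by cross-validation amounts to choosing the value with the lowest cross-validation error averaged over replicate runs. The sketch below shows one way to do this in Python. It assumes each run's log contains lines of the form `CV error (K=3): 0.52341`, as ADMIXTURE prints when cross-validation is enabled; the function name `best_k` is hypothetical and this is not the authors' script.

```python
import re
from collections import defaultdict
from statistics import mean

# Pattern for the cross-validation line assumed to appear in each log.
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.eE+-]+)")

def best_k(log_texts):
    """Average the CV error per K over replicate logs.

    log_texts: iterable of strings, each the text of one log file.
    Returns (K with the lowest mean CV error, {K: mean CV error}).
    """
    errors = defaultdict(list)
    for text in log_texts:
        for k, err in CV_LINE.findall(text):
            errors[int(k)].append(float(err))
    if not errors:
        raise ValueError("no CV error lines found")
    mean_err = {k: mean(v) for k, v in sorted(errors.items())}
    return min(mean_err, key=mean_err.get), mean_err
```

Averaging over replicates before taking the minimum reduces the influence of a single run that converged to a poor local optimum.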

| Population differentiation
Population differentiation was estimated using Weir and Cockerham's (1984) FST (Table 2). Nei's genetic distance (Nei, 1972) was calculated and used to construct a neighbor-joining tree (Figure 2), which grouped samples by host plant with strong bootstrap support (99-100%).
Intriguingly, while carrot (AT) is a geographic outlier, tomato (TT) and chickpea (TH), which were collected in adjacent fields, were split with 100% bootstrap support. Faba bean host samples were split into four groups with moderate bootstrap support (95%).
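For reference, Nei's (1972) standard genetic distance between two populations is D = -ln(Jxy / sqrt(Jx Jy)), where Jx, Jy, and Jxy are the per-locus averages of the probabilities that two alleles drawn within or between populations are identical. The following Python sketch computes it for biallelic SNPs from per-population alternate-allele frequencies. The input format and the function name `nei_distance` are assumptions for illustration, and tree construction is not shown.

```python
import numpy as np

def nei_distance(p_x, p_y):
    """Nei's (1972) standard genetic distance for biallelic loci.

    p_x, p_y: alternate-allele frequencies per SNP in populations X and Y
    (same SNP order in both). Hypothetical input format for illustration.
    """
    p_x = np.asarray(p_x, dtype=float)
    p_y = np.asarray(p_y, dtype=float)
    if p_x.shape != p_y.shape:
        raise ValueError("frequency vectors must have the same length")

    # Probability of identity within each population, averaged over loci.
    j_x = np.mean(p_x**2 + (1.0 - p_x) ** 2)
    j_y = np.mean(p_y**2 + (1.0 - p_y) ** 2)
    # Probability of identity between populations, averaged over loci.
    j_xy = np.mean(p_x * p_y + (1.0 - p_x) * (1.0 - p_y))

    # Genetic identity I = Jxy / sqrt(Jx * Jy); distance D = -ln(I).
    return -np.log(j_xy / np.sqrt(j_x * j_y))
```

The distance is zero for identical allele frequencies and grows as the two populations share fewer alleles.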

| Analysis of molecular variance and isolation by distance
A hierarchical AMOVA, as implemented in poppr with method = "ade4," was carried out using population within host (Table 4). To test the role of isolation by distance (IBD), a Mantel test was conducted using Edwards' genetic distance and a geographic distance matrix (latitude, longitude). Overall, we find no support for IBD (R = 0.55, p = .10) across populations. We do find, however, two distinct patches in the kernel density estimates for IBD (Figure 3).
This patchiness appears to be driven by the outlying population, AT, which exhibits moderate genetic differentiation (mean FST = 0.075; Table 2) and geographic separation (mean distance = 1.94). This outlying patch drives a significant linear relationship between genetic distance and geographic distance (dashed line in Figure 3, R² = 0.283, p = .0001).
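The logic of a Mantel test can be sketched in Python as follows. The sketch correlates the upper triangles of two precomputed distance matrices and builds a null distribution by permuting the rows and columns of one matrix together. It is a generic one-sided permutation version, not the implementation used in this study, and the function name `mantel` and its defaults are hypothetical.

```python
import numpy as np

def mantel(dist_a, dist_b, permutations=9999, seed=0):
    """One-sided Mantel test for positive correlation of two distance matrices.

    dist_a, dist_b: square symmetric matrices over the same populations,
    e.g. genetic and geographic distances (assumed precomputed).
    Returns (observed Pearson r, permutation p-value).
    """
    a = np.asarray(dist_a, dtype=float)
    b = np.asarray(dist_b, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1] or a.shape != b.shape:
        raise ValueError("inputs must be square matrices of equal shape")

    n = a.shape[0]
    upper = np.triu_indices(n, k=1)  # each population pair counted once
    a_vec = a[upper]
    r_obs = np.corrcoef(a_vec, b[upper])[0, 1]

    rng = np.random.default_rng(seed)
    at_least_as_large = 0
    for _ in range(permutations):
        # Relabel the populations of one matrix, rows and columns together.
        perm = rng.permutation(n)
        b_perm = b[np.ix_(perm, perm)]
        if np.corrcoef(a_vec, b_perm[upper])[0, 1] >= r_obs:
            at_least_as_large += 1

    # Count the observed statistic as one member of the null distribution.
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return r_obs, p_value
```

Permuting whole populations, rather than individual pairwise distances, preserves the dependence among entries that share a population, which is why a simple regression p-value on the pairwise points can differ from the Mantel result.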

| Population structure
The analysis of population structure was conducted using two

| DISCUSSION
Despite the interest devoted to Orobanche spp. given their tremendous economic impact, knowledge of their genetic variability is still limited. Genetic diversity analysis is of great importance as it will facilitate understanding of the genetic structure of parasitic weed populations. This knowledge will provide insights into parasite dispersal, host specialization, and the development of new races, and will help establish diverse collections of parasite races for use in crop breeding programs (Román, Hernández, et al., 2007; Román et al., 2002; Vaz Patto et al., 2008). However, accurate genetic diversity studies require powerful and reliable genetic tools, such as molecular markers.
Throughout the last two decades, several studies attempted to elucidate patterns of genetic variation in Orobanche spp. using different molecular markers such as RFLP (Vaz Patto et [...] spp. using GBS. Our work is therefore the first to use this approach for broomrape, generating 2935 high-quality SNP markers that provided substantially higher resolution relative to earlier approaches. In the present study, the genetic diversity and population structure of 10 O. crenata populations originating from different locations and crop hosts in Algeria were analyzed by GBS-SNPs. [...] (Table 2) [...] Previous studies in O. crenata populations from Morocco (Ennami, Briache, Mbasani Mansi, et al., 2017), Spain (Román et al., 2001, 2002), and Egypt (Abdalla et al., 2016) also reported clear genetic variation at the intra-population level and only little differentiation among populations. According to Musselman (1986) and [...] were recorded for the most geographically distant populations, in particular that of Ain Temouchent (AT), which is the farthest collection location. Genetic differentiation among populations is expected to increase with increased geographic distance (Slatkin, 1993).

| Genetic differentiation of populations
Analysis of the pairwise genetic differentiation in our study provides some evidence that genetic distance between populations increased proportionally with geographic distance. This was also the conclu- [...] (Román, Hernández, et al., 2007; Román et al., 2002; Stojanova et al., 2019).
Results from prior research support the observation that interpopulation differentiation is more likely to be detected between distant countries than within countries (Satovic et al., 2009). In explaining what could be responsible for this trend, Román et al. (2001, 2002) suggested that geographic distance provides a substantial barrier to gene flow as long as there is no commercial exchange of host seeds between the regions, whereas within a country migration forces between populations are continuous and strongly favored by efficient dispersal of the parasite seeds via humans, machinery, animals, or wind, as well as on host seeds. [...] crenata might be the region of "Mitidja." In addition, from data in [...] Conversely, evidence of host differentiation has been found in other parasitic weeds, such as O. foetida (Román, Hernández, et al., 2007; Vaz Patto et al., 2008; Thorogood et al., 2008; but not Boukteb et al., 2021), Striga hermonthica (Unachukwu et al., 2017), and Phelipanche ramosa (Stojanova et al., 2019). These differences may be due to specific methods or sampling design, or may arise from the type of genetic marker selected, as single-locus co-dominant markers are more efficacious for population biology insights (Sunnucks, 2000).
The present study provides relevant population genetic information that may benefit future breeding programs and management practices aimed at bolstering resistance against this parasitic weed.
Other aspects that are worth further investigation may include cross-infestation experiments to ascertain host preferences and specialization (Román, 2013; Stojanova et al., 2019). Also, it would be interesting to study genetic interactions between wild and weedy forms of O. crenata (Satovic et al., 2009). [...] Algeria. Future studies are needed to identify the evolutionary processes shaping this differentiation. [...] Research (MESRS - Algeria).

CONFLICT OF INTEREST
All authors declare no conflicts of interest.

OPEN RESEARCH BADGES
This article has been awarded Open Materials, Open Data Badges.
All materials and data are publicly accessible via the Open Science Framework at https://doi.org/10.7294/14838213.